PRODUCTION OF VANILLIN IN MICROBIAL CELLS



Benefit is claimed of U.S. Provisional Application No. 60/412,649, filed October 23, 
2002, the entirety of which is incorporated by reference herein. 

FIELD OF THE INVENTION 

This invention relates to the field of microbial genetic engineering to produce high- 
value food and nutraceutical substances. In particular, this invention provides novel 
transgenic microbial cells that produce vanillin. 

BACKGROUND OF THE INVENTION 

Various patents and publications are referred to throughout the specification. Each of
these is incorporated by reference herein in its entirety.

Vanillin is the principal flavor ingredient in vanilla extract and is also noted as
a nutraceutical because of its anti-oxidant and antimicrobial properties. Vanillin can be used
as a masking agent for undesirable flavors of other nutraceuticals. Vanilla extract is obtained
from cured vanilla beans, the bean-like pods produced by Vanilla planifolia, a tropical
climbing orchid.

Vanilla extract is widely used as a flavor by the food and beverage industry, and is
used increasingly in perfumes. Because of the ever-increasing demand for natural food
ingredients, natural vanilla extract produced from vanilla beans is presently the most
desirable form of vanilla. The areas of the world capable of supporting vanilla cultivation are
limited, due to its requirement for a warm, moist and tropical climate with frequent, but not
excessive, rain and moderate sunlight. The primary growing region for vanilla is around the
Indian Ocean, in Madagascar, Comoros, Reunion and Indonesia.

The production of vanilla beans is a lengthy process that is highly dependent on
suitable soil and weather conditions. Beans (pod-like fruit) are produced after 4-5 years of
cultivation. Flowers must be hand-pollinated, and fruit production takes about 8-10 months.
The characteristic flavor and aroma develop in the fruit after a process called "curing,"
lasting an additional 3-6 months. For a complete review of the vanilla growing and curing
process, see D. Havkin-Frenkel & R. Dorn, "Vanilla," Chapter 4 in Spices: Flavor Chemistry
and Antioxidant Properties (Eds. Risch & Ho), American Chemical Society, Washington,
1997.

Vanillin is also produced chemically by molecular breakage of curcumin, eugenol or 
piperin. However, vanillin produced by this method can be labeled as a natural flavor only in 
non-vanilla flavors. Vanillin chemically synthesized from guaiacol is consumed at a rate of 
about 2,500 tons per year in the United States for the food and beverage industry. Though 
less expensive than natural vanillin, vanillin produced by chemical synthesis or breakage can 
be undesirable due to the market's current preference for natural food ingredients.

Interest has focused recently on plant cell and tissue culture as an approach to control 
quality and yield of vanilla production and to solve some of the agronomic problems 
associated with growing vanilla. Another possible means for producing vanillin is through 
the use of microorganisms engineered to possess the requisite complement of vanillin 
biosynthetic enzymes. 

Several C6-C3 source compounds, mostly eugenol and ferulic acid, are currently in use
in conjunction with fermentation technologies for the biotechnological production of vanillin
(Benz, 1996, Biotechnological production of vanillin. In: AJ Taylor and DS Mottram, Eds,
Flavour Science - Recent Developments, The Royal Society of Chemistry, Cambridge, UK, pp
111-117). Eugenol, a major aromatic constituent in clove oil, is converted by a Pseudomonas
strain to ferulic acid through successive steps entailing formation of coniferyl alcohol,
coniferyl aldehyde and, finally, ferulic acid. Ferulic acid is also present in cereal crops, where
the compound is esterified to arabinose moieties, comprising around 0.4 to 3.0% of the cell
wall material (Walton et al., 2000, Curr. Op. Biotechnol. 11: 490-496). Ferulic acid may be
released from the cell wall matrix with the use of strong alkali or by enzymatic cleavage of
the wall material using cinnamoyl esterase in combination with cell wall hydrolyzing
enzymes (Williamson et al., 1998, Microbiology 144: 2011-2023). Such processes are
expensive and time consuming, and can require specialized equipment.

Moreover, bioconversion of ferulic acid to produce vanillin, as opposed to undesired
by-products such as vanillic acid, heretofore has not been a straightforward process.
Although ferulate is readily metabolized by various microbial systems, the end product is
mostly vanillic acid (Dignum & Verpoorte, 2001, Food Rev. Int. 17: 199-219).

It appears that the mode of degradation of a three-carbon side chain of a
hydroxycinnamic acid derivative, eugenol or ferulic acid for instance, to a single carbon
moiety determines the metabolic fate of phenylpropanoid compounds. There are several
reports on the in vitro chain shortening-catalyzed degradation of C6-C3 to C6-C1 compounds,
such as benzoic acids and aldehydes, from hydroxycinnamic acids. One study on the
synthesis of 4-hydroxybenzoate in L. erythrorhizon indicates that the pathway entails
oxidation and cleavage of 4-coumaroyl CoA to 4-hydroxybenzoyl CoA and acetyl CoA in a
thiolase type reaction with requirement for NAD (Loscher and Heide, 1994, Plant Physiol
106: 271-279). This mode of enzyme action, involving oxidative chain shortening, may
account for the formation of vanillic acid as an oxidative cleavage product from ferulic acid,
instead of the sought-after vanillin.

Microorganisms capable of utilizing abundant and inexpensive starting materials to
produce vanillin in a straightforward manner, without unwanted by-products, are currently not
available. Thus, a need exists for their creation and development.

SUMMARY OF THE INVENTION

The present invention features a transgenic microorganism that produces vanillin
when provided with caffeic acid or derivative thereof of esterified coumaric acid. The
organism is transformed with expressible nucleic acid sequences encoding (1) a
3-O-methyltransferase, preferably from a plant source, which converts caffeic acid to ferulic acid
and (2) either a eukaryotic (preferably plant) non-oxidative chain-shortening enzyme or a
bacterial CoA ligase and enoyl-CoA hydratase/lyase enzymatic system, either of which
converts ferulic acid to vanillin. In one embodiment, the microorganism comprises, naturally
or via recombinant means, an expressible nucleic acid molecule encoding an esterase that
converts caffeic acid esters (e.g., cichoric acid, rosmarinic acid or chlorogenic acid) to caffeic
acid.

In one embodiment, the transgenic microorganism is a procaryote, such as E. coli,
Pseudomonas or any other prokaryotic microorganism that can be transformed and used for
expression of foreign proteins. In another embodiment, the transgenic microorganism is a
eucaryote, such as the yeasts Saccharomyces cerevisiae or, in a preferred embodiment, Pichia
pastoris. The microorganism is preferably one that does not degrade or further metabolize
vanillin once it is produced.

The present invention also features a method for producing vanillin, which
comprises: (a) providing a transgenic organism that produces vanillin when provided with
caffeic acid or an esterified derivative thereof, as described above; (b) culturing the
transgenic organism in the presence of the caffeic acid or derivative thereof, under conditions
whereby the transgenic organism produces vanillin; and (c) recovering the vanillin from the
culture.

Another aspect of the invention features an O-methyltransferase from Vanilla
planifolia that catalyzes methylation of substrates selected from the group consisting of
5-OH-ferulic acid ethyl ester, caffeic acid ethyl ester, caffeoyl aldehyde,
5-OH-coniferaldehyde, 5-OH-ferulic acid, 3,4-dihydroxybenzaldehyde and caffeic acid. In one
embodiment, the enzyme has an amino acid sequence at least 90% identical to SEQ ID NO:2,
and more specifically comprises amino acid SEQ ID NO:2.

Also featured is an isolated nucleic acid molecule that encodes the
O-methyltransferase described above. In one embodiment, the nucleic acid encodes a
polypeptide having an amino acid sequence at least 90% identical to SEQ ID NO:2 and more
specifically encodes a polypeptide having SEQ ID NO:2. In an exemplary embodiment, the
nucleic acid molecule has a sequence comprising SEQ ID NO:1.

Additional features and advantages of the present invention will be understood by 
reference to the drawings, detailed description and examples that follow. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1. Schematic diagram showing the biotransformation of cichoric acid to
vanillin.

Figure 2. Schematic diagram showing the biotransformation of rosmarinic acid to 
vanillin. 

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS 
I. Definitions 

Various terms relating to the biological molecules of the present invention are used
hereinabove and also throughout the specification and claims.

With reference to nucleic acid molecules, the term "isolated nucleic acid" or "isolated
polynucleotide" is sometimes used. This term, when applied to DNA, refers to a DNA
molecule that is separated from sequences with which it is immediately contiguous (in the 5'
and 3' directions) in the naturally occurring genome of the organism from which it was
derived. For example, the "isolated nucleic acid" may comprise a DNA molecule inserted
into a vector, such as a plasmid or virus vector, or integrated into the genomic DNA of a
procaryote or eucaryote. An "isolated nucleic acid molecule" may also comprise a cDNA
molecule. With respect to RNA molecules, the term "isolated nucleic acid" primarily refers to
an RNA molecule encoded by an isolated DNA molecule as defined above. Alternatively,
the term may refer to an RNA molecule that has been sufficiently separated from RNA
molecules with which it would be associated in its natural state (i.e., in cells or tissues), such
that it exists in a "substantially pure" form (the term "substantially pure" is defined below).

With respect to protein, the term "isolated protein" or "isolated and purified protein"
is sometimes used herein. This term refers primarily to a protein produced by expression of
an isolated nucleic acid molecule. Alternatively, this term may refer to a protein which has
been sufficiently separated from other proteins with which it would naturally be associated,
so as to exist in "substantially pure" form.

The term "substantially pure" refers to a preparation comprising at least 50-60% by
weight the compound of interest (e.g., nucleic acid, oligonucleotide, protein, etc.). More
preferably, the preparation comprises at least 75% by weight, and most preferably 90-99% by
weight, the compound of interest. Purity is measured by methods appropriate for the
compound of interest (e.g. chromatographic methods, agarose or polyacrylamide gel
electrophoresis, HPLC analysis, and the like).

The term "enzyme" refers to a protein having enzymatic activity. As used herein, the
term enzyme may refer to the singular or plural, in instances where two or more enzymes
form an enzymatic system to convert one substance into another.

"Antibodies" as used herein includes polyclonal and monoclonal antibodies, chimeric,
single chain, and humanized antibodies, as well as Fab fragments, including the products of
an Fab or other immunoglobulin expression library. With respect to antibodies, the term
"immunologically specific" refers to antibodies that bind to one or more epitopes of a protein
of interest, but which do not substantially recognize and bind other molecules in a sample
containing a mixed population of antigenic biological molecules.

"Variant" as the term is used herein, is a polynucleotide or polypeptide that differs
from a reference polynucleotide or polypeptide respectively, but retains essential properties.
A typical variant of a polynucleotide differs in nucleotide sequence from another, reference
polynucleotide. Changes in the nucleotide sequence of the variant may or may not alter the
amino acid sequence of a polypeptide encoded by the reference polynucleotide. Nucleotide
changes may result in amino acid substitutions, additions, deletions, fusions and truncations
in the polypeptide encoded by the reference sequence, as discussed below. A typical variant
of a polypeptide differs in amino acid sequence from another, reference polypeptide.
Generally, differences are limited so that the sequences of the reference polypeptide and the
variant are closely similar overall and, in many regions, identical. A variant and reference
polypeptide may differ in amino acid sequence by one or more substitutions, additions, or
deletions in any combination. A substituted or inserted amino acid residue may or may not
be one encoded by the genetic code. A variant of a polynucleotide or polypeptide may be
naturally occurring, such as an allelic variant, or it may be a variant that is not known to
occur naturally. Non-naturally occurring variants of polynucleotides and polypeptides may
be made by mutagenesis techniques or by direct synthesis.

The term "substantially the same" refers to nucleic acid or amino acid sequences
having sequence variations that do not materially affect the nature of the protein (i.e. the
structure, stability characteristics, substrate specificity and/or biological activity of the
protein). With particular reference to nucleic acid sequences, the term "substantially the
same" is intended to refer to the coding region and to conserved sequences governing
expression, and refers primarily to degenerate codons encoding the same amino acid, or
alternate codons encoding conservative substitute amino acids in the encoded polypeptide.
With reference to amino acid sequences, the term "substantially the same" refers generally to
conservative substitutions and/or variations in regions of the polypeptide not involved in
determination of structure or function.

The terms "percent identical" and "percent similar" are also used herein in
comparisons among amino acid and nucleic acid sequences. When referring to amino acid
sequences, "identity" or "percent identical" refers to the percent of the amino acids of the
subject amino acid sequence that have been matched to identical amino acids in the compared
amino acid sequence by a sequence analysis program. "Percent similar" refers to the percent
of the amino acids of the subject amino acid sequence that have been matched to identical or
conserved amino acids. Conserved amino acids are those which differ in structure but are
similar in physical properties such that the exchange of one for another would not appreciably
change the tertiary structure of the resulting protein. Conservative substitutions are defined
in Taylor (1986, J. Theor. Biol. 119:205). When referring to nucleic acid molecules, "percent
identical" refers to the percent of the nucleotides of the subject nucleic acid sequence that
have been matched to identical nucleotides by a sequence analysis program.

"Identity" and "similarity" can be readily calculated by known methods. Nucleic acid
sequences and amino acid sequences can be compared using computer programs that align
the similar sequences of the nucleic or amino acids and thus define the differences. In
preferred methodologies, the BLAST programs (NCBI) and parameters used therein are
employed, and the DNAstar system (Madison, WI) is used to align sequence fragments of
genomic DNA sequences. However, equivalent alignments and similarity/identity
assessments can be obtained through the use of any standard alignment software. For
instance, the GCG Wisconsin Package version 9.1, available from the Genetics Computer
Group in Madison, Wisconsin, and the default parameters used (gap creation penalty=12, gap
extension penalty=4) by that program may also be used to compare sequence identity and
similarity.
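By way of illustration only, the definitions of "percent identical" and "percent similar" given above, and the gap penalties just mentioned, can be sketched in a few lines of Python. This is a minimal sketch, not the procedure used by BLAST, DNAstar or the GCG package. The function names, the gap character and the small table of conserved groups are hypothetical choices made for the example (they are not the classification of Taylor, 1986), and the penalty formula assumes the convention that a gap costs the creation penalty plus the extension penalty for each gapped position.

```python
# Hypothetical groups of "conserved" amino acids, for illustration only.
CONSERVED_GROUPS = [set("ILVM"), set("FWY"), set("KRH"),
                    set("DE"), set("ST"), set("NQ")]


def percent_identity_similarity(subject, compared, gap="-"):
    """Return (percent identical, percent similar) for two aligned sequences.

    Both strings must come from the same alignment and therefore have equal
    length. Percentages are relative to the residues of the subject sequence.
    """
    if len(subject) != len(compared):
        raise ValueError("aligned sequences must have equal length")
    residues = identical = similar = 0
    for a, b in zip(subject.upper(), compared.upper()):
        if a == gap:
            continue  # a gap is not a residue of the subject sequence
        residues += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif b != gap and any(a in g and b in g for g in CONSERVED_GROUPS):
            similar += 1
    if residues == 0:
        raise ValueError("subject sequence contains no residues")
    return 100.0 * identical / residues, 100.0 * similar / residues


def gap_penalty(gap_length, creation=12, extension=4):
    """Penalty for a single gap under the assumed convention:
    creation + extension * gap_length."""
    if gap_length < 0:
        raise ValueError("gap length cannot be negative")
    if gap_length == 0:
        return 0
    return creation + extension * gap_length
```

For the aligned pair "MKV-LD" and "MRVALE", the first function reports 60% identity and 100% similarity under the illustrative groups, and a gap of three positions is charged 24 under the assumed convention.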

With respect to single-stranded nucleic acid molecules, the term "specifically
hybridizing" refers to the association between two single-stranded nucleic acid molecules of
sufficiently complementary sequence to permit such hybridization under pre-determined
conditions generally used in the art (sometimes termed "substantially complementary"). In
particular, the term refers to hybridization of an oligonucleotide with a substantially
complementary sequence contained within a single-stranded DNA or RNA molecule, to the
substantial exclusion of hybridization of the oligonucleotide with single-stranded nucleic
acids of non-complementary sequence.
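As a rough illustration of what "sufficiently complementary" means at the sequence level, the sketch below scores how many positions of an oligonucleotide can base-pair with a target site of equal length. The function names are hypothetical, and sequence complementarity is only a proxy: whether two molecules actually hybridize depends on the pre-determined conditions (temperature, salt, length), which this sketch does not model.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def reverse_complement(seq):
    """Return the DNA reverse complement of a DNA or RNA sequence (5' to 3')."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def fraction_complementary(oligo, target):
    """Fraction of oligo positions that base-pair with an equal-length target.

    Both sequences are written 5' to 3'; pairing is antiparallel, so the oligo
    is compared with the reverse complement of the target site.
    """
    if len(oligo) != len(target):
        raise ValueError("oligo and target site must have equal length")
    if not oligo:
        raise ValueError("sequences must not be empty")
    expected = reverse_complement(target)
    oligo_dna = oligo.upper().replace("U", "T")  # treat RNA oligos as DNA
    matches = sum(1 for a, b in zip(oligo_dna, expected) if a == b)
    return matches / len(oligo)
```

A perfectly complementary oligonucleotide scores 1.0; any cutoff below that for calling a sequence "substantially complementary" would be an assumption tied to the hybridization conditions chosen.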

A "coding sequence" or "coding region" refers to a nucleic acid molecule having 
sequence information necessary to produce a gene product, when the sequence is expressed. 

The term "operably linked" or "operably inserted" means that the regulatory
sequences necessary for expression of the coding sequence are placed in a nucleic acid
molecule in the appropriate positions relative to the coding sequence so as to enable
expression of the coding sequence. This same definition is sometimes applied to the
arrangement of other transcription control elements (e.g. enhancers) in an expression vector.

Transcriptional and translational control sequences are DNA regulatory sequences,
such as promoters, enhancers, polyadenylation signals, terminators, and the like, that provide
for the expression of a coding sequence in a host cell.

The terms "promoter", "promoter region" or "promoter sequence" refer generally to
transcriptional regulatory regions of a gene, which may be found at the 5' or 3' side of the
coding region, or within the coding region, or within introns. Typically, a promoter is a DNA
regulatory region capable of binding RNA polymerase in a cell and initiating transcription of
a downstream (3' direction) coding sequence. The typical 5' promoter sequence is bounded at
its 3' terminus by the transcription initiation site and extends upstream (5' direction) to
include the minimum number of bases or elements necessary to initiate transcription at levels
detectable above background. Within the promoter sequence is a transcription initiation site
(conveniently defined by mapping with nuclease S1), as well as protein binding domains
(consensus sequences) responsible for the binding of RNA polymerase.

A "vector" is a replicon, such as plasmid, phage, cosmid, or virus to which another
nucleic acid segment may be operably inserted so as to bring about the replication or
expression of the segment.

The term "nucleic acid construct" or "DNA construct" is sometimes used to refer to a
coding sequence or sequences operably linked to appropriate regulatory sequences and
inserted into a vector for transforming a cell. This term may be used interchangeably with
the term "transforming DNA" or "transgene". Such a nucleic acid construct may contain a
coding sequence for a gene product of interest, along with a selectable marker gene and/or a
reporter gene.

The term "selectable marker gene" refers to a gene encoding a product that, when
expressed, confers a selectable phenotype such as antibiotic resistance on a transformed cell.

The term "reporter gene" refers to a gene that encodes a product which is easily
detectable by standard methods, either directly or indirectly.

A "heterologous" region of a nucleic acid construct is an identifiable segment (or
segments) of the nucleic acid molecule within a larger molecule that is not found in
association with the larger molecule in nature. Thus, when the heterologous region encodes a
mammalian gene, the gene will usually be flanked by DNA that does not flank the
mammalian genomic DNA in the genome of the source organism. In another example, a
heterologous region is a construct where the coding sequence itself is not found in nature
(e.g., a cDNA where the genomic coding sequence contains introns, or synthetic sequences
having codons different than the native gene). Allelic variations or naturally-occurring
mutational events do not give rise to a heterologous region of DNA as defined herein. The
term "DNA construct", as defined above, is also used to refer to a heterologous region,
particularly one constructed for use in transformation of a cell.

A cell has been "transformed" or "transfected" by exogenous or heterologous DNA
when such DNA has been introduced inside the cell. The transforming DNA may or may not
be integrated (covalently linked) into the genome of the cell. In prokaryotes, yeast, and
mammalian cells for example, the transforming DNA may be maintained on an episomal
element such as a plasmid. With respect to eukaryotic cells, a stably transformed cell is one in
which the transforming DNA has become integrated into a chromosome so that it is inherited
by daughter cells through chromosome replication. This stability is demonstrated by the
ability of the eukaryotic cell to establish cell lines or clones comprised of a population of
daughter cells containing the transforming DNA. A "clone" is a population of cells derived
from a single cell or common ancestor by mitosis. A "cell line" is a clone of a primary cell
that is capable of stable growth in vitro for many generations.

The following sections set forth the general procedures involved in practicing the
present invention. To the extent that specific materials are mentioned, it is merely for the
purpose of illustration, and is not intended to limit the invention. Unless otherwise specified,
general biochemical and molecular biological procedures, such as those set forth in
Sambrook et al., Molecular Cloning, Cold Spring Harbor Laboratory (1989) or Ausubel et al.
(eds), Current Protocols in Molecular Biology, John Wiley & Sons (2003) are used.

II. Description 

In an effort to develop alternative methods for producing high quality vanillin in an
economically feasible way, the inventors have devised a biotransformative pathway that can
be engineered into selected species of microorganisms, which results in production of vanillin
from a readily available and inexpensive starting material, which is caffeic acid or a
substance that readily produces caffeic acid, such as cichoric acid, rosmarinic acid or
chlorogenic acid (sometimes referred to collectively herein as "caffeic acid derivatives").

Representative schemes for vanillin biosynthesis in accordance with the invention are
outlined in Figures 1 and 2. The schemes indicate that caffeic acid, obtained in the native
form or by hydrolysis of caffeic acid esters, is methylated to ferulic acid by
3-O-methyltransferase, a readily available gene product, which has also been cloned from V.
planifolia itself. In a preferred embodiment of the present invention, ferulic acid is thereafter
converted in a one-step process to vanillin by the action of a non-oxidative chain-shortening
enzyme, exemplified by the V. planifolia 4-hydroxybenzaldehyde synthase (4-HBS)
disclosed in U.S. Published Application No. 2003/0070188 A1 (April 2003) to Havkin-
Frenkel et al. An alternative embodiment uses bacterial enzymes in a two-step non-oxidative
process to convert ferulic acid to vanillin.

Advantageous to the present invention, caffeic acid or caffeic acid derivatives are
abundant in several plant species. These compounds are readily hydrolyzed by esterase
action, resulting in the release of caffeic acid (Nusslein et al. 2000, J. Nat. Prod. 63: 1615-
1618). Hence, hydrolytically produced caffeic acid, combined with the content of native
caffeic acid, can yield 10 to 15% free caffeic acid on dry weight basis. Because caffeic acid
or caffeic acid derivatives are present in plant tissues in a free form and because these
compounds are readily extracted (e.g., by ethanol), these materials offer an important
advantage as source compounds. By comparison, the use of ferulic acid itself as source
material is predicated on enzymatic release of the compound from its bound form to cell
walls. This process is complex, leading to impurities and, importantly, is costly. The method
of the present invention averts these problems by making use of abundant and readily
extractable caffeic acid or caffeic acid derivatives in plant tissues and by a direct methylation
of caffeic acid to form ferulic acid. The subsequent conversion of ferulate to vanillin by a
non-oxidative chain shortening enzyme also avoids the inadvertent conversion of C6-C3
compound to vanillic acid or other unintended end products, a problem encountered in other
production systems.
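To put the 10 to 15% figure in perspective, a simple stoichiometric bound can be worked out. The sketch below is an illustration rather than a measured result: it assumes complete 1:1 molar conversion of caffeic acid (C9H8O4, about 180.16 g/mol) to vanillin (C8H8O3, about 152.15 g/mol), and the function name and the 1 kg batch size are hypothetical. Actual yields would be lower.

```python
CAFFEIC_ACID_G_PER_MOL = 180.16  # C9H8O4
VANILLIN_G_PER_MOL = 152.15      # C8H8O3


def max_vanillin_g(dry_plant_g, caffeic_fraction):
    """Upper bound on vanillin (g) from dry plant material.

    Assumes every mole of free caffeic acid is converted to one mole of
    vanillin, which is an idealization.
    """
    if not 0.0 <= caffeic_fraction <= 1.0:
        raise ValueError("caffeic_fraction must be between 0 and 1")
    caffeic_g = dry_plant_g * caffeic_fraction
    return caffeic_g * VANILLIN_G_PER_MOL / CAFFEIC_ACID_G_PER_MOL


# Hypothetical 1 kg batch at the two ends of the stated range.
low = max_vanillin_g(1000, 0.10)
high = max_vanillin_g(1000, 0.15)
```

Under these assumptions, 1 kg of dry plant material containing 10 to 15% free caffeic acid corresponds to at most roughly 84 to 127 g of vanillin.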

Natural sources for caffeic acid and derivatives thereof include, but are not limited to,
Echinacea spp. and other species in the mint family, and may further include any plant
species that contain the compounds. In a preferred embodiment, cichoric acid is obtained
from Echinacea spp. Table 1 shows other plant sources of caffeic acid or its derivatives. In
addition to those listed in Table 1, plant species that contain caffeic acid or derivatives
suitable for use in the present invention include, but are not limited to, liquorice (Glycyrrhiza
glabra, G. inflata, G. uralensis), oregano (Origanum compactum and other species), sage
(Salvia fruticosa), carrot (Daucus carota), fennel (Foeniculum vulgare) and artichoke
(Cynara cardunculus).

Table 1. Abundance of caffeic acid and caffeic acid derivatives in some plant species

Plant Species              Caffeic Acid    Rosmarinic Acid (1)    Chicoric Acid (2)
                           % Dry Weight (3)

Ocimum basilicum           0.5-2.5
Agastache sp.              1.0-2.0
Echinacea purpurea         3.6
Mentha sp.                 1.0-2.0
Rosmarinus officinalis     0.1-2.0

(1) Cinnamic acid, 3,4-dihydroxy-, 2-ester with 3-(3,4-dihydroxyphenyl) lactic acid
(2) Tartaric acid, bis (3,4-dihydroxycinnamate)
(3) Values were obtained by HPLC analysis.

The biotransformation of cichoric acid to vanillin is shown schematically in Figure 1.
A similar biotransformation is accomplished using rosmarinic acid, as shown in Figure 2.
Cichoric acid, rosmarinic acid and chlorogenic acid (5-caffeoylquinic acid) are all esters of
caffeic acid, and can be converted to caffeic acid in a similar manner. Other caffeic acid
esters that can be utilized as caffeic acid sources include, but are not limited to,
1-caffeoylquinic acid and 1,3-dicaffeoylquinic acid (cynarin). Hydrolysis of caffeic acid
esters can be accomplished with heat, pressure and mild alkaline solution, or enzymatically
by esterases, which is a preferred embodiment. Esterases suitable for catalyzing the
conversion of caffeic acid esters to caffeic acid are known in the art and are present in plants,
animals and many microorganisms. In the latter instance, therefore, the esterases often need
not be engineered into such microorganisms because they exist there naturally. In a preferred
embodiment a microorganism naturally capable of producing caffeic acid from esters thereof
is utilized in the present invention.

Caffeic acid is methylated to produce ferulic acid, using a 3-O-methyltransferase
obtainable from numerous plant sources, among other organisms. This enzyme catalyzes a
methylation at position 3 on the ring (and may also methylate position 5 if it is hydroxylated).
Examples of suitable 3-O-methyltransferases include, but are not limited to (GenBank
Accession Numbers follow each listed source organism): Catharanthus roseus, AY028439;
Clarkia breweri, AF006009; Coffea canephora, AF454631; Eucalyptus gunnii, X74814;
Festuca arundinacea, AF153825; Hordeum vulgare, U54767; Hordeum vulgare, AB086416;
Lolium perenne, AF010291; Medicago sativa, M63853; Nicotiana tabacum class I, X74452;
Nicotiana tabacum class II, X74452; Ocimum basilicum, AF154918; Populus tremuloides,
X62096; Prunus amygdalus, X83217; Saccharum officinarum, AJ231133; Sorghum bicolor,
AY2m66; Thalictrum tuberosum, AF064696; Triticum aestivum, AY226581; and Zea
mays, M73235. Nucleic acid and deduced amino acid sequences set forth in the
aforementioned Accessions are each incorporated by reference herein in their entireties.
Preferred for use is the 3-O-methyltransferase from Medicago sativa (alfalfa). Also preferred
for use is the 3-O-methyltransferase from Vanilla planifolia. A cDNA sequence (SEQ ID
NO:1) and deduced amino acid sequence (SEQ ID NO:2) of this enzyme are shown in the
Sequence Listing that forms part of this document.

In one embodiment, ferulic acid is converted to vanillin using a eukaryotic
non-oxidative chain-shortening enzyme, such as the plant-derived 4-HBS described in U.S.
Published Application No. 2003/0070188 A1 to Havkin-Frenkel et al. (2003). Any similar
eukaryotic aldehyde synthase that acts as a non-oxidative chain-shortening enzyme may also
be utilized. Conversion of ferulic acid to vanillin by non-oxidative means offers the
advantage of reducing or eliminating formation of undesired vanillic acid, as discussed
above.

In an alternative embodiment, ferulic acid is converted to vanillin using a bacterial
chain shortening enzyme system, enoyl-SCoA hydratase/lyase (Gasson et al., 1998, J. Biol.
Chem. 273: 4163-4170). The bacterial process is a two-step enzymatic process, involving
first a CoA ligase (an enzyme found in eukaryotes, see, e.g., Gross et al., 1973, FEBS Lett
31: 283-287, as well as bacteria) to activate ferulate to the CoA derivative and then a
hydratase/lyase to enzymatically cleave the double bond, releasing vanillin and acetyl-CoA.
This enzyme system has been demonstrated to catalyze the conversion of ferulic acid to
vanillin. It is found in a number of bacteria, including but not limited to Pseudomonas
fluorescens (Civolani et al., 2000, Appl Environ Microbiol. 66: 2311-2317; Narbad and
Gasson, 1998, Microbiology 144: 1397-1405), Pseudomonas putida (Venturi et al., 1998,
Microbiology 144: 965-973), other Pseudomonas species (Overhage et al., 1999, Appl
Microbiol Biotechnol. 52: 820-828; Overhage and Steinbuchel, 1999, Appl Environ
Microbiol. 65: 4837-47) and Nocardia spp. (Li and Rosazza, 2000, Appl. Environ. Microbiol.
66: 684-687).

In certain instances, enoyl-SCoA hydratase/lyase may utilize caffeic acid as a
substrate to form 3,4-dihydroxybenzaldehyde. This product also may be converted to
vanillin through the action of the above-described 3-O-methyltransferases.
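The alternative enzymatic routes described in this section can be summarized as a small directed graph. The following Python sketch is only a bookkeeping aid: the dictionary layout and the function name are hypothetical, and each edge simply restates a conversion named in the text (it says nothing about rates, cofactors or which route a given host would favor).

```python
# substrate -> list of (enzyme or enzyme system, product), as described above
PATHWAY = {
    "caffeic acid ester": [("esterase", "caffeic acid")],
    "caffeic acid": [
        ("3-O-methyltransferase", "ferulic acid"),
        ("enoyl-SCoA hydratase/lyase", "3,4-dihydroxybenzaldehyde"),
    ],
    "ferulic acid": [
        ("non-oxidative chain-shortening enzyme (e.g. 4-HBS)", "vanillin"),
        ("CoA ligase + enoyl-SCoA hydratase/lyase", "vanillin"),
    ],
    "3,4-dihydroxybenzaldehyde": [("3-O-methyltransferase", "vanillin")],
}


def routes(substrate, target="vanillin"):
    """Return every sequence of enzymes leading from substrate to target."""
    if substrate == target:
        return [[]]
    found = []
    for enzyme, product in PATHWAY.get(substrate, []):
        for tail in routes(product, target):
            found.append([enzyme] + tail)
    return found


for route in routes("caffeic acid ester"):
    print(" -> ".join(route))
```

Starting from a caffeic acid ester, the sketch lists three routes: two through ferulic acid (eukaryotic or bacterial chain shortening) and one through 3,4-dihydroxybenzaldehyde followed by methylation.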

Expression vectors comprising DNA that encodes the aforementioned enzymes are
introduced into a selected microorganism. Preferably, a microorganism that is amenable to
genetic manipulation is utilized. In addition, it is preferred that the microorganism is not
capable of degrading or further metabolizing the end product, vanillin. Suitable
microorganisms for practice of the invention include, but are not limited to, E. coli and
Pseudomonas spp. as model procaryotic expression systems and yeast such as
Saccharomyces cerevisiae or Pichia pastoris as model eucaryotic expression systems.
Vectors and systems for transforming these and other organisms are well known in the art.

After the microorganism has been engineered to express all enzymes necessary for the
conversion of the caffeic acid or its derivatives to vanillin, production of vanillin is
accomplished as follows: (1) grow the engineered microorganism in a suitable culture
medium; (2) add the selected caffeic acid or derivative to the culture medium; (3) grow the
culture for a time, and under conditions to enable production of vanillin, which preferably,
but not necessarily, is secreted into the medium; and (4) recover the vanillin from the cells or
medium. Vanillin can be purified from a solution by well-established methods (e.g., Priefert
et al., 2001, Appl. Microbiol. Biotechnol. 56: 296-314; Klinke et al., 2002, Bioresource
Technology 82: 15-26), used in vanillin manufacturing from lignin, for instance. Examples
include vanillin volatilization from solutions above 80°C and crystallization from saturated
solutions.

The present invention also provides a novel multifunctional methyltransferase from
Vanilla planifolia, and its encoding nucleic acid, both of which are useful in the practice of
the present invention and for other purposes. This enzyme, referred to herein as "vpOMT," is
capable of catalyzing the conversion of caffeic acid to ferulic acid, and also the conversion of
3,4-dihydroxybenzaldehyde to vanillin. Details of the isolation and characterization of
vpOMT, and the cloning of a cDNA encoding vpOMT, are set forth in the examples.

A cDNA encoding vpOMT is set forth herein as SEQ ID NO:1, and its encoded
protein is set forth herein as SEQ ID NO:2. Although this particular cDNA and polypeptide
are described and exemplified herein, this invention is intended to encompass proteins from
other Vanilla cultivars and species that are sufficiently similar to be used interchangeably
with the characterized vpOMT for the purposes described herein.

Accordingly, considered in terms of their sequences, vpOMT-encoding nucleic acids
of the invention include allelic variants and natural mutants of SEQ ID NO:1, which are
likely to be found in different varieties of V. planifolia and Vanilla, and homologs of SEQ ID
NO:1 likely to be found in different plant species. Because such variants and homologs are
expected to possess certain differences in nucleotide and amino acid sequence, this invention
provides an isolated vpOMT-encoding nucleic acid molecule that encodes a vpOMT
polypeptide having at least about 90% (and, with increasing order of preference, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98% and 99%) identity with SEQ ID NO:2, with a
corresponding level of nucleotide sequence identity with respect to SEQ ID NO:1. Because
of the natural sequence variation likely to exist among vpOMT enzymes and the genes
encoding them in different plant varieties and species, one skilled in the art would expect to
find this level of variation, while still maintaining the unique properties of the vpOMT of the
present invention. Such an expectation is due in part to the degeneracy of the genetic code,
as well as to the known evolutionary success of conservative amino acid sequence variations,
which do not appreciably alter the nature of the encoded protein. Accordingly, such variants
and homologs are considered substantially the same as one another and are included within
the scope of the present invention.
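
By way of illustration only, the identity thresholds recited above can be checked with a
short calculation such as the following sketch. The two sequences are hypothetical
placeholders and are not SEQ ID NO:2 or any variant of it. The convention used here
(identical positions divided by aligned columns, with a position gapped in only one
sequence counted as a mismatch) is an assumption; other conventions for computing
percent identity are in use and may give slightly different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns gapped in both sequences are ignored;
    a column gapped in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical 20-residue fragments that differ at two positions.
reference = "MGSTAETQLTPVQVTDDEAA"
variant = "MGSTAETQLSPVQVTDEEAA"

identity = percent_identity(reference, variant)
meets_threshold = identity >= 90.0  # the "at least about 90%" level
print(f"identity = {identity:.1f}%, meets 90% threshold: {meets_threshold}")
```

The function expects sequences that have already been aligned; producing the alignment
itself is outside the scope of this sketch.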

VpOMT-encoding nucleic acid molecules of the invention may be prepared by two
general methods: (1) they may be synthesized from appropriate nucleotide triphosphates, or
(2) they may be isolated from biological sources. Both methods utilize protocols well known
in the art.

WO 2004/036979
PCT/US2003/034011

The availability of nucleotide sequence information, such as the cDNA having SEQ
ID NO:1, enables preparation of an isolated nucleic acid molecule of the invention by
oligonucleotide synthesis. Synthetic oligonucleotides may be prepared by the
phosphoramidite method employed in the Applied Biosystems 38A DNA Synthesizer or
similar devices. The resultant construct may be purified according to methods known in the
art, such as high performance liquid chromatography (HPLC).

VpOMT genes also may be isolated from appropriate biological sources using
methods known in the art. Nucleic acids having the appropriate level of sequence homology
with part or all of SEQ ID NO:1 may be identified by using hybridization and washing
conditions of appropriate stringency. For example, hybridizations may be performed,
according to the method of Sambrook et al., using a hybridization solution comprising: 5X
SSC, 5X Denhardt's reagent, 1.0% SDS, 100 µg/ml denatured, fragmented salmon sperm
DNA, 0.05% sodium pyrophosphate and up to 50% formamide. Hybridization is carried out
at 37-42°C for at least six hours. Following hybridization, filters are washed as follows: (1) 5
minutes at room temperature in 2X SSC and 1% SDS; (2) 15 minutes at room temperature in
2X SSC and 0.1% SDS; (3) 30 minutes-1 hour at 37°C in 2X SSC and 0.1% SDS; (4) 2
hours at 45-55°C in 2X SSC and 0.1% SDS, changing the solution every 30 minutes.

One common formula for calculating the stringency conditions required to achieve
hybridization between nucleic acid molecules of a specified sequence homology is
(Sambrook et al., 1989):

Tm = 81.5°C + 16.6 log [Na+] + 0.41(% G+C) - 0.63(% formamide) - 600/#bp in duplex

As an illustration of the above formula, using [Na+] = [0.368] and 50% formamide,
with GC content of 42% and an average probe size of 200 bases, the Tm is 57°C. The Tm of a
DNA duplex decreases by 1 - 1.5°C with every 1% decrease in homology. Thus, targets with
greater than about 75% sequence identity would be observed using a hybridization
temperature of 42°C.
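
The worked illustration above can be reproduced with the following minimal sketch. It
assumes that the logarithm in the formula is base 10 and that [Na+] is expressed in
mol/L; the function and parameter names are introduced here for illustration only.

```python
import math


def melting_temperature(na_molar: float, gc_percent: float,
                        formamide_percent: float, duplex_bp: int) -> float:
    """Tm in degrees C from the formula quoted in the text.

    Assumes a base-10 logarithm and [Na+] in mol/L.
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / duplex_bp)


# Values taken from the illustration in the text.
tm = melting_temperature(na_molar=0.368, gc_percent=42.0,
                         formamide_percent=50.0, duplex_bp=200)
print(f"Tm = {tm:.1f} C")
```

With the stated inputs the individual terms are approximately 81.5, -7.2, +17.2, -31.5
and -3.0, which sum to about 57°C, in agreement with the value given above.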

The stringency of the hybridization and wash depends primarily on the salt
concentration and temperature of the solutions. In general, to maximize the rate of annealing
of the probe with its target, the hybridization is usually carried out at salt and temperature
conditions that are 20-25°C below the calculated Tm of the hybrid. Wash conditions
should be as stringent as possible for the degree of identity of the probe for the target. In
general, wash conditions are selected to be approximately 12-20°C below the Tm of the
hybrid. With regard to the nucleic acids of the current invention, a moderate stringency
hybridization is defined as hybridization in 6X SSC, 5X Denhardt's solution, 0.5% SDS and
100 µg/ml denatured salmon sperm DNA at 42°C, and wash in 2X SSC and 0.5% SDS at
55°C for 15 minutes. A high stringency hybridization is defined as hybridization in 6X SSC,
5X Denhardt's solution, 0.5% SDS and 100 µg/ml denatured salmon sperm DNA at 42°C,
and wash in 1X SSC and 0.5% SDS at 65°C for 15 minutes. A very high stringency
hybridization is defined as hybridization in 6X SSC, 5X Denhardt's solution, 0.5% SDS and
100 µg/ml denatured salmon sperm DNA at 42°C, and wash in 0.1X SSC and 0.5% SDS at
65°C for 15 minutes.

Nucleic acids of the present invention may be maintained as DNA in any convenient
cloning vector. In a preferred embodiment, clones are maintained in a plasmid
cloning/expression vector, such as pGEM-T (Promega Biotech, Madison, WI) or pBluescript
(Stratagene, La Jolla, CA), either of which is propagated in a suitable E. coli host cell.

VpOMT nucleic acid molecules of the invention include cDNA, genomic DNA,
RNA, and fragments thereof which may be single- or double-stranded. Thus, this invention
provides oligonucleotides (sense or antisense strands of DNA or RNA) having sequences
capable of hybridizing with at least one sequence of a nucleic acid molecule of the present
invention, such as selected segments of SEQ ID NO:1.

VpOMT polypeptides may be prepared in a variety of ways, according to known
methods. In one embodiment the protein is purified from appropriate sources, e.g., plant
tissue as described in the examples.

Alternatively, the availability of nucleic acid molecules encoding the polypeptides
enables production of the proteins using in vitro expression methods known in the art. For
example, vpOMT may be produced by expression in a suitable procaryotic or eucaryotic
system. Part or all of a DNA molecule, such as the cDNA having SEQ ID NO:1, may be
inserted into a plasmid vector adapted for expression in a bacterial cell (such as E. coli) or a
yeast cell (such as Saccharomyces cerevisiae), or into a baculovirus vector for expression in
an insect cell. Such vectors comprise the regulatory elements necessary for expression of the
DNA in the host cell, positioned in such a manner as to permit expression of the DNA in the
host cell. Such regulatory elements required for expression include promoter sequences,
transcription initiation sequences and, optionally, enhancer sequences.

Polyclonal or monoclonal antibodies directed toward any of the peptides encoded by
vpOMT may be prepared according to standard methods. Monoclonal antibodies may be
prepared according to general methods of Kohler and Milstein, following standard protocols.

The following examples are provided to describe the invention in greater detail. They 
are intended to illustrate, not to limit, the invention. 

EXAMPLE 1 

Methods for Isolation and Purification of vpOMT protein and cDNA 

Plant material 

Tissue cultures of V. planifolia were initiated and maintained as described by
Podstolski et al. (2002, Phytochemistry 61: 611-620). Plants of V. planifolia were
maintained in the greenhouse and were the source of stem, leaf, and root tissues. Green V.
planifolia pods at different stages of development were obtained from Indonesia.

Enzyme extraction and assay

Preparation of crude protein extracts of the V. planifolia pods and tissue cultures
grown in liquid media was modified from that described by Wang et al. (1997, Plant Physiol.
114: 213-221). For determining the presence of DOMT activity, 3 g tissue was homogenized
in 6 ml of 50 mM BisTris-HCl, pH 6.9, 10 mM 2-mercaptoethanol, 5 mM Na2S2O5, 1% (w/v)
PVP-40, 1 mM phenylmethanesulfonyl fluoride (PMSF), and 10% (v/v) glycerol. The
homogenate was filtered through cheesecloth and centrifuged 15 min at 10,000 g at 4 °C.

Protein concentrations were determined using the Bio-Rad protein assay reagent (Bio- 
Rad, Richmond, CA) with bovine serum albumin as a standard. 

O-Methyltransferase assays were as described by Wang et al. (1997, supra). Assays
were done in 50 µl volumes and were composed of 10 µl assay buffer (250 mM Tris-HCl, pH
7.5, 10 mM DTT), 1 µl of 50 mM substrate, 10 µl enzyme (crude extracts or fractions from
partial purification), 1 µl of S-[methyl-14C]adenosyl-L-methionine (SAM) (59
mCi/mmol) (Amersham Pharmacia Biotech, Buckinghamshire, England), and 28 µl water.
The samples were incubated at 30 °C for 30 min, after which the reactions were stopped by
adding 2.5 µl of 6 M HCl. [14C]SAM was separated from the radiolabeled methylated
product by extraction with 100 µl ethyl acetate. Twenty µl of the organic phase containing
the labeled product was used for liquid scintillation counting. The counts per min were
converted to pkat (picomoles of product produced per second), based on the specific activity
of the substrate and the efficiency of the scintillation counter.
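
The conversion from counts per minute to pkat described above can be sketched as
follows. This is an illustrative calculation under stated assumptions, not the exact
procedure used: the counts value and the counter efficiency are hypothetical, the
extraction of product into ethyl acetate is assumed to be quantitative, and the fraction
counted is taken as 20 µl of the 100 µl organic phase. Only the specific activity
(59 mCi/mmol) and the 30 min incubation come from the text.

```python
DPM_PER_CURIE = 2.22e12  # disintegrations per minute in one curie


def cpm_to_pkat(cpm: float, counter_efficiency: float,
                specific_activity_mci_per_mmol: float,
                fraction_counted: float, incubation_s: float) -> float:
    """Convert scintillation counts to pkat (pmol product per second)."""
    dpm_counted = cpm / counter_efficiency       # correct for counter efficiency
    dpm_total = dpm_counted / fraction_counted   # scale to the whole extract
    # mCi/mmol -> dpm/pmol: 1 mCi = 2.22e9 dpm and 1 mmol = 1e9 pmol
    dpm_per_pmol = specific_activity_mci_per_mmol * (DPM_PER_CURIE / 1e3) / 1e9
    pmol_product = dpm_total / dpm_per_pmol
    return pmol_product / incubation_s


pkat = cpm_to_pkat(
    cpm=1000.0,                           # hypothetical reading
    counter_efficiency=0.9,               # hypothetical efficiency
    specific_activity_mci_per_mmol=59.0,  # from the text
    fraction_counted=20.0 / 100.0,        # 20 µl of 100 µl ethyl acetate
    incubation_s=30 * 60,                 # 30 min incubation
)
print(f"{pkat:.4f} pkat")
```

Dividing the result by the milligrams of protein in the assay would give a specific
activity in the units used in Table 2. If unlabeled SAM is added, as in the kinetic
assays, the specific activity passed to the function would need to reflect the dilution.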

For determination of the kinetic parameters, the reaction conditions were modified to
include 2 µl [14C]SAM, 100 µM unlabelled SAM, and 3 of the purified recombinant
protein expressed in E. coli. Substrate concentrations ranged from 0.001 mM to 4 mM. All
reactions were done in duplicate. Vmax and Km were calculated from nonlinear regressions
of the Michaelis-Menten plots using the program Prism 4 (GraphPad Software, Inc., San
Diego, CA).
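
For readers unfamiliar with this type of analysis, the following sketch shows one way a
nonlinear least-squares fit of the Michaelis-Menten equation can be carried out. It is
not the algorithm used by Prism 4; it is a simple substitute written for illustration.
The data are synthetic values generated from assumed parameters (Vmax = 10, Km = 0.5 mM)
at substrate concentrations within the range stated above, and are not experimental
results.

```python
def fit_michaelis_menten(substrate_mM, rates):
    """Least-squares fit of v = Vmax * S / (Km + S); returns (Vmax, Km).

    For a fixed Km the best Vmax has a closed form, so only Km is searched:
    first on a coarse logarithmic grid, then by golden-section refinement.
    """
    def best_vmax(km):
        x = [s / (km + s) for s in substrate_mM]
        return sum(v * xi for v, xi in zip(rates, x)) / sum(xi * xi for xi in x)

    def sse(log_km):
        km = 10.0 ** log_km
        vmax = best_vmax(km)
        return sum((v - vmax * s / (km + s)) ** 2
                   for s, v in zip(substrate_mM, rates))

    # Coarse grid over Km = 1e-4 to 1e2 mM in steps of 0.01 log units.
    grid = [-4.0 + 0.01 * i for i in range(601)]
    centre = min(grid, key=sse)
    lo, hi = centre - 0.01, centre + 0.01

    # Golden-section search inside the bracket found by the grid.
    ratio = (5 ** 0.5 - 1) / 2
    for _ in range(60):
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if sse(a) < sse(b):
            hi = b
        else:
            lo = a
    km = 10.0 ** ((lo + hi) / 2)
    return best_vmax(km), km


# Synthetic, noise-free data from assumed parameters (not measured values).
TRUE_VMAX, TRUE_KM = 10.0, 0.5
substrate = [0.001, 0.01, 0.05, 0.1, 0.5, 1.0, 2.0, 4.0]  # mM
observed = [TRUE_VMAX * s / (TRUE_KM + s) for s in substrate]

vmax, km = fit_michaelis_menten(substrate, observed)
print(f"Vmax = {vmax:.3f}, Km = {km:.3f} mM, Vmax/Km = {vmax / km:.2f}")
```

The ratio Vmax/Km computed in the last line is the quantity used to rank substrates in
Table 3.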

Thin layer chromatography

The identity of vanillin as the labeled reaction product following methylation of 3,4-
dihydroxybenzaldehyde was confirmed by TLC analysis. Twenty µl aliquots of the organic
extract were spotted onto a 20 cm x 20 cm silica gel 60 precoated TLC plate (EM Industries,
Inc, Gibbstown, NJ). Twenty µl each of 10 mM vanillin, 10 mM 3,4-dihydroxybenzaldehyde
and a mixture of both were also spotted as standards. The plate was developed in a solvent
system of chloroform:acetic acid (9:1, v/v). To visualize the standards following
chromatography, the plate was allowed to dry and examined under UV light. The region of
the plate from the reaction product that corresponded to the position of standard vanillin was
scraped into scintillation vials and counted.

Partial purification of V. planifolia OMT 

For protein purification, a crude extract of the tissue culture was prepared by
homogenizing in 10 volumes fresh weight of the extraction buffer. Partial purification of
DOMT activity from the crude extract on an adenosine-agarose affinity column was modified
from that described by Wang and Pichersky (1998, Arch. Biochem. Biophys. 349: 153-160).
A 1 ml adenosine-agarose (Sigma, St. Louis, MO) column was prepared as previously
described (Attieh et al., 1995, J. Biol. Chem. 270: 9250-9257). Ten ml of tissue culture crude
extract was applied to the adenosine-agarose column. The column was washed with 6 ml 50
mM Bis-Tris, pH 6.9, 10 mM 2-mercaptoethanol, 10% glycerol followed by elution with 10
ml wash buffer containing 2.5 mM adenosine. One ml fractions were collected and assayed
for DOMT and COMT activities. Fractions containing activity were combined and
concentrated using Microcon YM30 devices (Amicon, Beverly, MA).

PCR amplification of O-methyltransferase cDNA fragment

Degenerate oligonucleotide primers for PCR were designed based on conserved
sequences in COMTs from other plant species. The amino acid sequences encoded by the
primers were VLMESWY and HVGGDMF, respectively.
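
To give a sense of why such primers are degenerate, the sketch below counts how many
distinct nucleotide sequences could encode each of the two peptides under the standard
genetic code. The actual primer sequences are not given in this passage, so the numbers
are an upper bound on primer degeneracy under that assumption; in practice degeneracy is
often reduced, for example by omitting the third base of the final codon.

```python
# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def backtranslation_degeneracy(peptide: str) -> int:
    """Count the nucleotide sequences that can encode the given peptide."""
    count = 1
    for residue in peptide:
        count *= CODONS_PER_RESIDUE[residue]
    return count


forward_degeneracy = backtranslation_degeneracy("VLMESWY")
reverse_degeneracy = backtranslation_degeneracy("HVGGDMF")
print(forward_degeneracy, reverse_degeneracy)
```

Both peptides contain methionine, and the first also contains tryptophan, each of which
has a single codon; conserved motifs rich in such residues are commonly chosen for
degenerate primer design because they keep the number of primer variants manageable.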

The degenerate oligonucleotide primers were used in PCR amplification of the cDNA
library prepared from the V. planifolia tissue cultures. PCR reactions were carried out using
the Elongase Amplification System (Invitrogen, Carlsbad, CA). The 100 µl reactions
contained 60 mM Tris-SO4, pH 9.1, 18 mM (NH4)2SO4, 1.5 mM MgSO4, 200 µM each
dNTP, 3 ng of each oligonucleotide, and 2 µl Elongase enzyme mix. PCR was carried out in
a GeneAmp 9600 thermocycler (Perkin Elmer Life Sciences, Boston, MA). Touchdown PCR
cycling parameters were used. Initial denaturation was conducted at 94 °C for 30 s. Cycle 1
consisted of denaturation at 94 °C for 30 s, annealing at 66 °C for 30 s, and extension at 68
°C for 2 min. Every two subsequent cycles, the annealing temperature was decreased by 1 °C
until 56 °C was reached. An additional 30 cycles at an annealing temperature of 56 °C were
performed, followed by a final extension at 68 °C for 10 min. PCR products were resolved
on a 1.2% (w/v) agarose gel, and a
single band of about 350 bp was detected. The DNA band was excised and purified using a
commercial kit (QIAquick Gel Extraction Kit, Qiagen USA). The purified band was ligated
into the pGEM-T Easy vector (Promega, Madison, WI) and transformed into JM109 E. coli
competent cells. Plasmids were purified from E. coli transformants using a commercial kit
(QIAprep Spin Miniprep Kit, Qiagen) and sequenced using SP6 and T7 primers.
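
The touchdown cycling described above can be laid out explicitly as an annealing
temperature schedule. The sketch below assumes one reading of the text: two cycles at
each annealing temperature from 66 °C down to 57 °C, followed by 30 cycles at 56 °C.
The function and parameter names are illustrative only.

```python
def touchdown_annealing_schedule(start_c: int = 66, end_c: int = 56,
                                 cycles_per_step: int = 2,
                                 final_cycles: int = 30) -> list:
    """Return the annealing temperature (deg C) for each PCR cycle in order."""
    schedule = []
    temperature = start_c
    # Step down 1 degree after every `cycles_per_step` cycles.
    while temperature > end_c:
        schedule.extend([temperature] * cycles_per_step)
        temperature -= 1
    # Remaining cycles are run at the final annealing temperature.
    schedule.extend([end_c] * final_cycles)
    return schedule


schedule = touchdown_annealing_schedule()
total_cycles = len(schedule)
print(total_cycles, schedule[:6], schedule[18:22])
```

Under this reading the program comprises 20 touchdown cycles plus 30 cycles at 56 °C.
Each cycle would use the denaturation (94 °C, 30 s) and extension (68 °C, 2 min) steps
stated in the text.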



cDNA library screening 

A cDNA library was constructed by Stratagene (La Jolla, CA) in the λ ZAP-Express
vector using poly(A)+ RNA from V. planifolia tissue culture. Four hundred and fifty
thousand plaque-forming units were screened using the 350 bp PCR clone as probe. The
cloned 350 bp fragment was labeled with [α-32P]dCTP using a commercial kit (Prime-It II
Random Primer Labeling Kit, Stratagene).

The plaque lifts were prehybridized at 42 °C in 50% (v/v) formamide, 5X SSC, 5X
Denhardt's solution [1X Denhardt's solution is 0.02% (w/v) Ficoll, 0.02% (w/v) PVP, 0.02%
(w/v) BSA], 50 mM sodium phosphate, pH 6.8, 1% (w/v) SDS, 100 µg ml^-1 calf thymus
DNA, and 2.5% (w/v) dextran sulfate. The hybridization solution was 5 X 10^ cpm ml^-1 of
32P-labeled fragment, 50% (v/v) formamide, 5X SSC, 1X Denhardt's solution, 20 mM
sodium phosphate, pH 6.8, 1% (w/v) SDS, 100 µg ml^-1 calf thymus DNA, and 5% (w/v)
dextran sulfate. Hybridized membranes were washed with 2X SSPE, 0.5% (w/v) SDS for 15
minutes at room temperature, 2X SSPE, 0.5% (w/v) SDS for 15 minutes at 65 °C, and 0.2X
SSPE, 0.2% (w/v) SDS for 15 minutes at 65 °C. The washed filters were exposed to X-ray
film (XOMAT-AR, Kodak, Rochester, NY) with an intensifying screen. Positive plaques
were subjected to two additional rounds of screening to isolate single positive plaques. The
cDNA inserts from positive plaques were excised from the λ-vector as recombinant pBK-
CMV phagemids. A full-length clone was completely sequenced by primer walking.

Expression of the V. planifolia OMT in Escherichia coli 

The coding sequence of the OMT was amplified by PCR using oligonucleotides that
introduced XhoI sites at the 5' and 3' ends. The PCR amplification product was separated on
a 1% (w/v) agarose gel and the DNA band excised from the gel and extracted using a
commercial kit (QIAquick Gel Extraction Kit, Qiagen). The PCR product was digested with
XhoI and again gel purified. The digested PCR product was then ligated to the XhoI-digested
dephosphorylated pET-15b expression vector (Novagen, Madison, WI) and transformed into
ElectroMAX™ DH10B cells (Invitrogen) via electroporation. Plasmids from positive
transformants were completely sequenced to confirm that no errors had been introduced
through the PCR process. A plasmid containing the perfect OMT sequence was then
transformed into BL21(DE3) cells (Novagen) for protein expression.

A BL21(DE3) OMT transformant was grown at 37°C in LB with 50 µg ml^-1
ampicillin to OD600 = 0.5. Protein expression was then induced by adding IPTG to 0.05 mM.
Additional 50 µg ml^-1 ampicillin was also added and the cells grown overnight at 20 °C. The
cells were collected by centrifugation at 12,000g for 15 min, lysed using BugBuster™
Protein Extraction Reagent (Novagen) and treated with Benzonase Nuclease (Novagen)
according to the manufacturer's instructions. Cell debris was removed by centrifugation at
12,000g for 20 min, the clarified lysate was applied to a His-Bind column (Novagen), and the
expressed OMT protein eluted according to the manufacturer's instructions. The eluted
protein was passed through a PD-10 column (Amersham Pharmacia Biotech AB, Uppsala,
Sweden) equilibrated with the OMT assay buffer and concentrated 3-fold using Ultrafree-4
centrifugal filter units (Millipore Corporation, Bedford, MA). The concentrated protein was
used for enzyme activity assays.

Antibody production and immunoblot analysis

The purified recombinant protein was used for preparation of V. planifolia OMT-
specific antiserum. The purified protein was mixed with an equal volume of Freund's
complete (first injection) or incomplete (subsequent injections) adjuvant and was injected
into the subscapular space of a rabbit. Three injections of about 100 µg of protein each were
given at 4-week intervals.

For immunoblot analysis, proteins were extracted by homogenizing tissue samples in
phosphate-buffered saline (1.5 mM NaH2PO4, 8.1 mM Na2HPO4, 145.5 mM NaCl) in a ratio
of 0.4 g 800 The extracts were centrifuged to remove debris and the protein
concentrations of the supernatants determined using the Bio-Rad protein assay reagent.
Twenty µg of protein was mixed with an equal volume of 2x sodium dodecyl sulfate (SDS)
sample buffer [2x: 125 mM Tris, pH 6.8, 4.6% (w/v) SDS, 10% (v/v) 2-mercaptoethanol,
20% (v/v) glycerol and 0.002% bromophenol blue (w/v)]. The proteins were transferred to
nitrocellulose membranes (NitroPure, Osmonics, Westborough, MA) in 10 mM 3-
(cyclohexylamino)-1-propane-sulfonic acid (CAPS), pH 11, 10% methanol (v/v). Processing
and detection by chemiluminescence (Western Lightning Chemiluminescence Kit, Perkin
Elmer Life Science) was according to the manufacturer's instructions.

EXAMPLE 2 

Characterization of vpOMT Activity, Protein and cDNA

3,4-Dihydroxybenzaldehyde-O-methyltransferase activity in V. planifolia pods and
tissue culture

A three-step pathway for vanillin biosynthesis from 4-coumaric acid has been
proposed based on precursor accumulation and on feeding cell cultures of V. planifolia with
the proposed precursors (Havkin-Frenkel et al., in: T.J. Fu, G. Singh, W.R. Curtis (Eds.),
Plant Cell and Tissue Culture for the Production of Food Ingredients. Kluwer Academic
Press/Plenum Publishers, New York, 1999, pp 35-43). In this pathway 4-coumaric acid is
first converted to 4-hydroxybenzaldehyde through a chain-shortening step. Hydroxylation at
position 3 on the ring results in 3,4-dihydroxybenzaldehyde (also called protocatechuic
aldehyde). The 3-hydroxyl group is then methylated producing vanillin. An enzyme from V.
planifolia that catalyzes the chain-shortening step, 4-hydroxybenzaldehyde synthase, has
been isolated as described hereinabove (see also, Podstolski et al., 2002, supra).

The proposed 3-step vanillin biosynthetic pathway postulates a 3,4-
dihydroxybenzaldehyde-O-methyltransferase (DOMT) activity as the final step resulting in
the production of vanillin. Green V. planifolia pods at different stages of development were
obtained from Indonesia. Crude extracts of the inner region of the pods where vanillin is
synthesized were assayed for DOMT activity by following the transfer of [14C] from
radiolabelled SAM to 3,4-dihydroxybenzaldehyde. DOMT activity doubled between 3 and 5
months after pollination and was maintained at a similar level through 11 months after
pollination (Table 2). The increase in DOMT activity at 5 months after pollination
corresponded to the developmental stage when vanillin accumulation in the pods begins.

Table 2. DOMT and COMT activities (pkat mg^-1 protein) in V. planifolia pods and tissue
culture extracts. Values presented are the means of duplicate assays.

Sample                          DOMT Activity    COMT Activity
                                (pkat mg^-1)     (pkat mg^-1)

Pods, crude extracts
  Months after pollination
    3                           0.37             N.D.*
    5                           0.90             N.D.
    8                           0.78             N.D.
    11                          0.80             N.D.
Tissue culture
  Crude extract                 6.96             0.79
  Adenosine column              17.5             13.2

* Not determined



Tissue cultures of V. planifolia have been established that accumulate vanillin and its
proposed precursors, including 3,4-dihydroxybenzaldehyde (Havkin-Frenkel et al., 1996,
Plant Cell Tiss Org Cult 45: 133-136). Crude extracts of the tissue cultures were found to
have both DOMT and COMT activities (Table 2). With 3,4-dihydroxybenzaldehyde as the
substrate, [14C]vanillin was identified as the product by co-migration with unlabeled standard
vanillin on a TLC plate. Seventy-eight percent of the radioactivity present in the crude
reaction product was recovered from the TLC plate at the position of authentic vanillin.

Partial purification of DOMT activity from V. planifolia tissue culture

Since the V. planifolia tissue cultures had DOMT activity at similar levels to the pods
and were a convenient source of plant material, the first approach to characterizing the
enzyme was to purify it from the tissue cultures. Affinity purification by binding to
adenosine-conjugated agarose has been successful in purifying some OMTs. Both DOMT
and COMT activities could be partially co-purified from the tissue culture crude extract by
chromatography on an adenosine-agarose column (Table 2). SDS gel analysis of the active
fractions revealed a major band at approximately 42 kD and a minor band at approximately
27 kD. COMTs from other species are in the range of 37.6-42.3 kD. The 42 kD band seen in
the SDS gel of the active fractions appeared to be a single band and was likely the source of
the O-methyltransferase activities. Peptide sequencing of the 42 kD band, however, revealed
it was heterogeneous and no sequences similar to COMTs were obtained. Additional
purification attempts were made using other column chromatography methods, but none were
successful in separating the DOMT and COMT activities from each other.

V. planifolia O-methyltransferase cDNA clone

To test whether the DOMT activity detected in V. planifolia tissue cultures originated
from a multifunctional methyltransferase that could methylate both 3,4-
dihydroxybenzaldehyde and caffeic acid, the inventors isolated a cDNA clone based on
conserved sequences in COMTs from other species for expression in E. coli. Degenerate
oligonucleotides based on the peptide sequences VLMESWY and HVGGDMF were used in
PCR of a cDNA library prepared from the V. planifolia tissue culture. A 350 bp amplified
band was cloned whose sequence was similar to COMTs from other plants. The PCR clone
was used to screen the cDNA library and a full-length clone was obtained. A 365 amino acid
protein with a molecular weight of 40,659 daltons was predicted from the cDNA sequence.

Similarity of the V. planifolia OMT to other sequences

The V. planifolia OMT amino acid sequence is similar to COMTs reported from other
plant species but the level of identity is not high to any other sequences currently in the
database. COMT sequences previously reported to be from V. planifolia (Xue and Brodelius,
1998, Plant Physiol. Biochem. 36: 779-788) have been withdrawn from the NCBI database
and now appear to actually be from Catharanthus roseus (Schroder et al., 2002,
Phytochemistry 59: 1-8). Phylogenetic analysis comparing 19 similar methyltransferase
sequences illustrates the relationship of the V. planifolia OMT sequence to methyltransferases
reported from other species. The amino acid sequence of the V. planifolia OMT shows a
similar level of divergence from the other monocot OMTs as from the dicot OMTs, perhaps
reflecting its phylogenetic distance from the other reported monocot COMTs. V. planifolia is
classified in the order Asparagales, whereas the other monocot species in the COMT
sequence comparison are in the order Poales.

Although there is considerable overall amino acid sequence variability among the
monocot and dicot COMTs, all the residues identified from the crystal structure of the alfalfa
enzyme as being involved in substrate binding or positioning (Zubieta et al., 2002, Plant Cell
14: 1265-1277) are generally well-conserved among most of the enzymes, including the V.
planifolia OMT. The one nonconserved substrate-binding residue in the V. planifolia enzyme
is N185 which is H183 at the corresponding position of the alfalfa enzyme. Amino acid
residues at the relative position of the alfalfa substrate-binding residue I316 exhibit
considerable variation among the other COMT sequences.

Two tobacco COMTs that are quite different from each other have been reported and
their substrate preferences have been compared (Maury et al., 1999, Plant Physiol. 121: 215-
223; Pellegrini et al., 1993, Plant Physiol. 103: 509-517). The relative substrate preferences
of tobacco class I COMT were similar to those of the alfalfa enzyme whereas tobacco class II
COMT had no activity against caffeic acid and 5-OH-ferulic acid, but did have activity
against 3,4-dihydroxybenzaldehyde (Maury et al., 1999, supra). Vanillin, the product of 3,4-
dihydroxybenzaldehyde methylation, has been detected in tobacco and its accumulation was
10-fold higher in a phenylalanine ammonia-lyase overexpressing cell line. The tobacco class
II COMT does differ from the alfalfa sequence at 5 of the conserved substrate binding
residues, suggesting these differences may relate to the observed differences in substrate
preferences.

Two barley sequences that are also quite different from each other have been reported
as COMTs (Lee et al., 1997, DNA Seq. 7: 357-363; Sugimoto et al., 2003, Biosci.
Biotechnol. Biochem. 67: 966-972). These two sequences stand out as different from the
others in amino acid sequence comparisons in that there is sequence variation at a number of
the conserved substrate-binding residues, and even in some catalytic sites, suggesting a closer
evaluation of the activities and substrate preferences of those enzymes may be interesting.
Barley EST sequences that are more closely related to the wheat sequence and which do have
the conserved substrate binding residues have been reported (GenBank Accession Numbers
AL505122, AL5045S9, HVSMEn0023E17f, HVSMEn0007I18f, HVSMEn0023M14f,
HVSSMEn0025G07f, and HVSMEn0009H02f).

Expression of V. planifolia OMT in E. coli

The protein encoded by the V. planifolia OMT cDNA was expressed as an N-terminal
polyhistidine-tagged fusion in E. coli from the expression vector pET-15b and the
recombinant protein purified by affinity chromatography. The expressed protein tended to
rapidly accumulate in insoluble inclusion bodies, so conditions were developed using a low
concentration of IPTG and low incubation temperature to allow accumulation of soluble
OMT protein.

The kinetic parameters of the purified recombinant protein were determined with
several phenolic and phenylpropanoid substrates (Table 3). The enzyme exhibited a
preference for 5-OH-ferulic acid ethyl ester and caffeic acid ethyl ester, although these are
unlikely to serve as substrates in vivo. Caffeoyl aldehyde and 5-OH-coniferaldehyde were
preferred over 5-OH-ferulic acid, 3,4-dihydroxybenzaldehyde, or caffeic acid. In general, the
relative substrate preferences for the V. planifolia enzyme were similar to those reported for
alfalfa COMT (Parvathi et al., 2001, Plant J. 25: 193-202), which has been confirmed by
down-regulation to be involved in S lignin biosynthesis (Guo et al., 2001, Plant Cell 13: 73-
88). This suggests that the V. planifolia enzyme characterized here may also function
primarily in the synthesis of lignin.

Table 3. Relative substrate preferences of V. planifolia recombinant OMT

Substrates                        Vmax/Km

5-OH-Ferulic acid ethyl ester     38.6
Caffeic acid ethyl ester          36.0
Caffeoyl aldehyde                 28.4
5-OH-Coniferaldehyde              19.7
5-OH-Ferulic acid                 9.6
3,4-Dihydroxybenzaldehyde         2.0
Caffeic acid                      1.9


V. planifolia OMT expression in different tissues 

Expression of the V. planifolia OMT in different tissues was evaluated by
immunoblot analysis. As expected, the OMT protein detected in the tissue samples was
slightly smaller than the purified recombinant His-tagged fusion protein. The highest OMT
protein level was in the root and tissue culture samples, with a lower level in the stem sample.
No immunoreactive band at the size of the OMT was detected in the leaf or pod samples.
The origin of the higher molecular weight bands observed in the stem and leaf samples is not
known. The lack of an immunoreactive band in the pod tissue was unexpected and
surprising, since both the pods and tissue cultures synthesize vanillin and both had DOMT
activity at similar levels (Table 2). These results suggest that the DOMT activities detected
in these tissues originate from distinct enzymes that do not exhibit antibody cross reactivity.
If this OMT is involved in the synthesis of vanillin it must be present in the pods at low levels
that are not detectable by immunoblot analysis of proteins from crude extracts. Since DOMT
activity was detectable in the pods, however, lack of an immunoreactive protein band suggests
this OMT is not the main contributor to the observed activity. Although the V. planifolia
OMT characterized here can convert 3,4-dihydroxybenzaldehyde to vanillin in vitro, the
kinetic parameters and the tissue localization suggest its primary function is likely to be in
lignin biosynthesis.

While certain of the preferred embodiments of the present invention have been
described and specifically exemplified above, it is not intended that the invention be limited
to such embodiments. Various modifications may be made thereto without departing from
the scope and spirit of the present invention, as set forth in the following claims.